Abstract. - We present a mathematical analysis of records drawn from independent random
variables with a drifting mean. To leading order the change in the record rate is proportional to 
the ratio of the drift velocity to the standard deviation of the underlying distribution. We apply 
the theory to time series of daily temperatures for given calendar days, obtained from historical 
climate recordings of European and American weather stations as well as re-analysis data. We 
conclude that the change in the mean temperature has increased the rate of record breaking events 
in a moderate but significant way: for the European station data covering the time period 1976-2005,
we find that about 5 of the 17 high temperature records observed on average in 2005 can
be attributed to the warming climate. 



Introduction. - In current media coverage the occurrence of record-breaking temperatures and other
extreme weather conditions is often associated with global climate change. However, record-breaking
events occur at a certain rate in any stationary random process. In mathematical terms, a record is
an entry in a time series that is larger (upper record) or smaller (lower record) than all previous
entries [1-3]. If the entries are independent and identically distributed random variables drawn
from a continuous probability distribution, the probability Pn to observe a new record in the nth
step, hereafter referred to as the record rate, is simply Pn = 1/n, because each of the n values is
equally likely to be the largest. Applying this result to the maximal temperatures measured on a
specific calendar day over a time span of n years, it follows that the expected number of records
per year is 365/n, i.e. about 12 records for an observation period of 30 years. Remarkably, this
prediction is entirely independent of the underlying probability distribution, which may even
differ between calendar days.
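
The distribution independence of Pn = 1/n can be checked with a short Monte Carlo experiment. The
sketch below is illustrative only: the function names, the two test distributions, the number of
trials and the seed are assumptions made for this example and are not part of the analysis.

```python
import random

def is_record(series):
    """True if the last entry strictly exceeds all previous entries."""
    return len(series) == 1 or series[-1] > max(series[:-1])

def record_rate(n, sampler, trials=20000, seed=1):
    """Monte Carlo estimate of P_n for i.i.d. values drawn with sampler(rng)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        series = [sampler(rng) for _ in range(n)]
        hits += is_record(series)
    return hits / trials

n = 30
gauss_rate = record_rate(n, lambda rng: rng.gauss(0.0, 1.0))
expo_rate = record_rate(n, lambda rng: rng.expovariate(1.0))

print(f"Gaussian estimate:    {gauss_rate:.4f}")
print(f"Exponential estimate: {expo_rate:.4f}")
print(f"1/n:                  {1 / n:.4f}")
print(f"Expected records per year, 365/n: {365 / n:.1f}")
```

Both estimates should agree with 1/n within the statistical error of the simulation, although the
two distributions have very different shapes.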

Despite considerable current interest in extreme climate events [4-14], the subject of climate
records has received relatively little attention. It is intuitively obvious that an increase in the
mean temperature will lead to an increased occurrence of high temperature records, but attempts to
detect this effect in observational data have long remained inconclusive [15-18]. Only very
recently did an empirical study of temperature data from the US find a significant effect of
warming on the relative occurrence of hot and cold records [19].

Fig. 1: (Color online) Schematic of the evolution of the daily temperature distribution under
linear drift of the mean.

Here we present a detailed analysis of several large data sets of temperature measurements from
both American and European weather stations, as well as re-analysis data¹. We find that the
observed increase in the number of high temperature records (and the corresponding decrease in the
low records) is well described by a minimal model which assumes that the distribution of
temperatures measured on a given calendar day is a Gaussian with constant standard deviation σ and
a mean that increases linearly in time at rate v (see Fig. 1). This model is consistent with
previous findings [18, 20-23] and it is supported by our own analysis of the available data sets
[24]; see Fig. 2 for an example.

¹ The re-analysis approach combines meteorological observations from a variety of sources with
advanced data assimilation techniques in order to create a continuous stream of observables on a
three-dimensional grid; see [22] for details.

While changes in temperature variability
have also been argued to be important in the generation of 
extreme temperature events |^|7j , we have failed to detect 
a clear systematic trend in σ in the data [Fig. 2(b)]. Moreover, the increase in the mean
supersedes a possible effect of a change in σ, in the sense that the former leads to an
asymptotically constant record rate [25-28], whereas the latter only increases the record rate from
1/n to (ln n)/n [29]. For these reasons we restrict ourselves to the simplest setting of a
temperature distribution of constant shape and linearly increasing mean. Although temperature
fluctuations are well known to display long-term correlations [30, 31], the assumption that the
daily temperatures are uncorrelated is justified because individual measurements in a time series
are always one year apart (see [18] and the quantitative discussion below).

Theory. - We begin by deriving an approximate analytic expression for the increase in the record
rate Pn caused by a linear drift of the mean. In general, the record rate for a sequence of
independent but not identically distributed random variables Xn is given by [29]

$$P_n = \int_{-\infty}^{\infty} dx\, f_n(x) \prod_{k=1}^{n-1} \left( \int_{-\infty}^{x} dx_k\, f_k(x_k) \right), \qquad (1)$$

where fn(x) denotes the probability density at time step n. Here we consider a drifting
distribution of constant shape, which implies fn(x) = f(x - vn) with a common density f(x). This
reduces (1) to

$$P_n = \int_{-\infty}^{\infty} dx\, f(x) \prod_{k=1}^{n-1} \left( \int_{-\infty}^{x+vk} dx_k\, f(x_k) \right). \qquad (2)$$

An explicit evaluation of (2) is possible for special choices of f(x), but in general it is only
known that Pn converges to a nonzero limit P* = lim Pn (n → ∞) when v > 0 [25-28]. In the climate
context the drift speed is expected to be small compared to the standard deviation of the
distribution. We therefore expand (2) to linear order in v, which yields

$$P_n \approx \frac{1}{n} + v\,\frac{n(n-1)}{2} \int_{-\infty}^{\infty} dx\, f^2(x)\, F^{n-2}(x), \qquad (3)$$

where F(x) is the cumulative distribution function of f(x). In [28] the integral in the second
term is evaluated for various elementary distributions. For distributions with a power law tail one
finds that the correction term decreases for large n. On the other hand, for distributions that
decay faster than exponentially, the correction term generally increases with n. In the Gaussian
case of interest here the integral can be evaluated in closed form only for n = 2 and 3, with the
result

$$P_2 \approx \frac{1}{2} + \frac{1}{2\sqrt{\pi}}\,\frac{v}{\sigma}, \qquad P_3 \approx \frac{1}{3} + \frac{3}{4\sqrt{\pi}}\,\frac{v}{\sigma}. \qquad (4)$$

Using a saddle point approximation and the properties of the Lambert W-function [32] to extract
the behavior for large n, one arrives at the asymptotic expression [28]

$$P_n \approx \frac{1}{n} + \frac{v}{\sigma}\,\frac{2\sqrt{\pi}}{e^2}\,\sqrt{\ln\left(\frac{n^2}{8\pi}\right)}, \qquad (5)$$

which is accurate for n > 7. For n = 4, 5, 6 the integral can be evaluated numerically. For a
typical value of v/σ ≈ 0.01 and a time span of 30 years, (5) implies an increase of the record rate
from 1/30 ≈ 0.033 to 0.042, or an increase in the expected number of record events per year from 12
to 15. In the following this prediction will be compared to empirical temperature data.

Fig. 2: (Color online) Behavior of the distribution of daily maximum temperatures for data set
EII. (a) Mean daily maximum temperature. Daily maximum temperatures were averaged over all stations
and the entire calendar year. Diamonds show the average of dTmax for individual years and the full
line is a sliding 3-year average. The regression line (dashed) shows a clear increase over the last
30 years. (b) Standard deviation of daily maximal temperatures. To estimate the standard deviation,
first a linear regression of dTmax was carried out for each station and each calendar day. The
standard deviation for a given year was then computed by averaging the squared deviation of dTmax
from the linear fit over all stations and all calendar days. Full line is a 3-year average and the
dashed line the result of a linear regression. We find no systematic trend in the standard
deviation. (c) Distribution of daily temperatures on individual calendar days. The measurements
were detrended and normalized for all time series individually and then accumulated. The dashed
line is the probability density of a standard normal distribution.
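
The size of the effect can also be checked by evaluating the drifting-mean integral (2) directly
for a Gaussian density. The following sketch does this with a simple trapezoidal rule; the function
names, the integration range and the grid resolution are assumptions chosen for this illustration,
and the drift is expressed through the dimensionless ratio c = v/σ.

```python
import math

def phi(x):
    """Standard normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def record_rate_drift(n, c, x_min=-8.0, x_max=8.0, steps=4000):
    """Record rate P_n from Eq. (2) for a unit Gaussian whose mean drifts by c per step."""
    h = (x_max - x_min) / steps
    total = 0.0
    for i in range(steps + 1):
        x = x_min + i * h
        prod = 1.0
        for k in range(1, n):
            prod *= Phi(x + c * k)      # inner integral of Eq. (2) for lag k
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoidal end points
        total += weight * phi(x) * prod
    return total * h

n = 30
p_stationary = record_rate_drift(n, 0.0)
p_drift = record_rate_drift(n, 0.01)

print(f"P_30 without drift:       {p_stationary:.4f}")
print(f"P_30 with v/sigma = 0.01: {p_drift:.4f}")
print(f"records per year:         {365 * p_stationary:.1f} -> {365 * p_drift:.1f}")
```

Without drift the integral reproduces 1/n, and with v/σ = 0.01 the result can be compared with the
increase from about 0.033 to about 0.042 quoted above. Since the code evaluates the full integral
rather than an expansion in v, small differences from the approximate value are to be expected.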

European data. - The most comprehensive analysis was carried out for temperature data obtained
within the ECA&D project, which comprises a total of 752 European stations [20]. The data consist
of daily recordings of the minimum, mean and maximum temperature, as well as precipitation and
snowfall. The stations recorded over time spans of varying length, from less than 10 to more than
200 years. Defective and missing entries were marked in the data sets. We restricted our study to
time series of daily minimum (dTmin) and maximum temperatures (dTmax) which were at least 95%
reliable. This resulted in two sets of station data, one (set EI) consisting of 43 stations that
recorded over the 100 year period 1906-2005, and a second (set EII) containing 187 stations that
recorded over the 30 year period 1976-2005. Each station recorded 365 time series, and the results
presented below constitute averages over all calendar days and all stations within the respective
data set.

Taken together, we thus had roughly 15,000 time series in data set EI and 68,000 time series in
data set EII at our disposal. However, it is important to note that the effective number of
independent time series is much smaller, because it is limited by correlations both in space and in
time. The correlations in space result from the fact that the time series from neighboring stations
are strongly correlated if they are less than 1000 km apart [33]. As the spatial distribution of
the stations over Europe was relatively homogeneous, we estimate an effective number of 12-15
independent stations for the European data. Furthermore, although daily temperature measurements in
subsequent years can be assumed to be uncorrelated, time series recorded on calendar days that are
close to each other are correlated as well. Based on the analysis of [31] we estimate that these
correlations extend over a duration of approximately 10 days, which implies that the number of
independent calendar days is around 36. We therefore conclude that our analysis of the European
data effectively comprises about 400-500 independent time series.
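
The counting behind these estimates is simple enough to spell out. The helper names below are
hypothetical and serve only to make the arithmetic explicit; the station numbers, the range of
12-15 independent stations and the 10-day correlation window are those quoted in the text.

```python
DAYS_PER_YEAR = 365

def nominal_series(stations):
    """One time series per station and calendar day."""
    return stations * DAYS_PER_YEAR

def effective_series(independent_stations, correlation_days=10):
    """Independent stations times the number of independent calendar days."""
    independent_days = DAYS_PER_YEAR // correlation_days
    return independent_stations * independent_days

print("nominal series, EI and EII:", nominal_series(43), nominal_series(187))
print("independent calendar days:", DAYS_PER_YEAR // 10)
print("effective series:", effective_series(12), "to", effective_series(15))
```

The product lands at a few hundred series, the same order as the estimate of about 400-500 given
above and two orders of magnitude below the nominal count.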

Figure 2 summarizes the analysis of the distribution of dTmax for data set EII. The mean maximal
temperature is found to increase at rate v = 0.047 ± 0.003°C/yr, while the standard deviation is
essentially constant with a mean value of σ = 3.4 ± 0.3°C. The detrended temperature fluctuations
are Gaussian to a good approximation. A corresponding analysis for data set EI yields a warming
rate of v = 0.0085 ± 0.003°C/yr and the same standard deviation as for data set EII.

Fig. 3: (Color online) (a) Record frequencies for data set EI (1906-2005). Symbols show the
average number of records per calendar day in forward (>) and backward (<) time direction, averaged
over 365 calendar days and 43 stations. Full and short-dashed bold lines were obtained by an
additional sliding 9-year average. Long-dashed bold line shows the prediction Pn = 1/n for a
stationary climate. (b) Record frequencies for data set EII (1976-2005). Symbols show the average
number of records per calendar day for upper (▲) and lower (▼) records in forward direction, and
for upper records in backward direction (<). Full lines were obtained by an additional sliding
3-year average. Dotted line shows the prediction Pn = 1/n for a stationary climate, and long-dashed
lines show the model predictions.
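
To illustrate how v and σ can be estimated from yearly values for a single calendar day, the sketch
below fits a least-squares line to each series and pools the residual variances. It runs on
synthetic Gaussian data generated with the EII parameters quoted above, not on the station data;
the function names, the offset of 12°C, the number of series and the seed are assumptions made for
the example.

```python
import math
import random

def linear_fit(series):
    """Least-squares line through (year index, value); returns (slope, intercept)."""
    n = len(series)
    t_mean = (n - 1) / 2.0
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean

def residual_variance(series, slope, intercept):
    """Mean squared deviation from the fitted line (two fitted parameters)."""
    n = len(series)
    return sum((y - intercept - slope * t) ** 2 for t, y in enumerate(series)) / (n - 2)

rng = random.Random(7)
true_v, true_sigma, years = 0.047, 3.4, 30   # synthetic stand-in for data set EII

slopes, variances = [], []
for _ in range(2000):
    series = [12.0 + true_v * t + rng.gauss(0.0, true_sigma) for t in range(years)]
    slope, intercept = linear_fit(series)
    slopes.append(slope)
    variances.append(residual_variance(series, slope, intercept))

v_hat = sum(slopes) / len(slopes)
sigma_hat = math.sqrt(sum(variances) / len(variances))
print(f"v = {v_hat:.3f} C/yr, sigma = {sigma_hat:.2f} C, v/sigma = {v_hat / sigma_hat:.4f}")
```

A single 30-year series constrains the slope only weakly, so averaging over many series is what
makes the estimate of v/σ usable.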

To directly test for correlations between daily temperatures, we computed the average two-point
correlation for subsequent years after subtracting the drift and normalizing. For both data sets
the correlations were found to fluctuate around a small average value of order ±0.01 with a
standard deviation of order 0.1. These values are consistent with a power law decay of the form
found in [30, 31].
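
A minimal version of such a correlation test is sketched below, again on synthetic data: each
series is detrended with a least-squares line, normalized by its residual standard deviation, and
the products of values in subsequent years are averaged. The function names, the 100-year length,
the drift and noise parameters and the seed are illustrative assumptions.

```python
import math
import random

def detrended_normalized(series):
    """Subtract the least-squares line and divide by the residual standard deviation."""
    n = len(series)
    t_mean = (n - 1) / 2.0
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series)) / sxx
    residuals = [y - y_mean - slope * (t - t_mean) for t, y in enumerate(series)]
    std = math.sqrt(sum(r * r for r in residuals) / n)
    return [r / std for r in residuals]

def lag_one_correlation(z):
    """Average product of normalized values in subsequent years."""
    return sum(a * b for a, b in zip(z, z[1:])) / (len(z) - 1)

rng = random.Random(3)
correlations = []
for _ in range(500):
    # Uncorrelated yearly values with a linear drift (synthetic, for illustration only).
    series = [0.047 * t + rng.gauss(15.0, 3.4) for t in range(100)]
    correlations.append(lag_one_correlation(detrended_normalized(series)))

mean_corr = sum(correlations) / len(correlations)
spread = math.sqrt(sum((c - mean_corr) ** 2 for c in correlations) / len(correlations))
print(f"mean correlation: {mean_corr:+.3f}, standard deviation: {spread:.3f}")
```

For uncorrelated input the individual correlations scatter with a spread of roughly one over the
square root of the series length, and detrending a finite series introduces a small bias of order
1/n, so values of the magnitude quoted above are compatible with very weak year-to-year
correlations.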

In Figure 3(a) we show results of the analysis of temperature records in data set EI. The figure
depicts the measured daily record frequency for upper records of dTmax, obtained both from a
forward analysis (where a record is the highest value of dTmax since 1906) and from a backward
analysis (where years are counted backwards in time and records are defined with respect to the
temperature in 2005). According to the prediction (5), the forward and backward record rates should
lie symmetrically around the record rate 1/n of the stationary climate, which is consistent with
the displayed data. Throughout the analyzed time span (with the exception of a short period around
1960 in which the climate was effectively cooling) the forward record frequency lies above the
backward record frequency. This shows that the increase in the mean temperature significantly
affects the statistics of records. The effect is particularly pronounced during the last two
decades, when warming has been most significant (see the discussion of data set EII below). For the
year 2005, the measured forward record frequency is about twice as large as expected for a
stationary climate. Using the mean warming rate estimated over the entire 100 yr time period, only
an enhancement of 40% is predicted by Eq. (5). This shows that the assumption of a constant rate
of warming is not a good approximation for data set EI.

Fig. 4: (Color online) Mean record number at European stations (1976-2005). Symbols show the
average number of upper (▲) and lower (▼) records observed since 1976 at a given calendar year in
the forward time analysis. Dotted line shows the prediction for a stationary climate, and dashed
lines show the prediction for a constant rate of warming. Inset shows the results for the entire
time span from 1976 to 2005.
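
The forward and backward analyses differ only in the direction in which a series is read. A
minimal sketch of the counting is given below; the function names and the six sample values are
made up for illustration, and the first entry of a series is counted as a record, in line with
P1 = 1.

```python
def record_flags(series):
    """flags[i] is True if series[i] strictly exceeds all earlier entries."""
    flags = []
    best = float("-inf")
    for value in series:
        flags.append(value > best)
        best = max(best, value)
    return flags

def count_records(series, direction="forward", kind="upper"):
    """Number of upper or lower records when reading the series forward or backward."""
    data = list(series) if direction == "forward" else list(reversed(series))
    if kind == "lower":
        data = [-value for value in data]   # a lower record is an upper record of -x
    return sum(record_flags(data))

# Hypothetical yearly maxima for one calendar day, with an upward trend.
temps = [20.1, 19.5, 21.0, 20.7, 22.3, 21.9]

forward_upper = count_records(temps, "forward", "upper")
backward_upper = count_records(temps, "backward", "upper")
forward_lower = count_records(temps, "forward", "lower")
backward_lower = count_records(temps, "backward", "lower")
print(forward_upper, backward_upper, forward_lower, backward_lower)
```

In a warming series the forward count of upper records exceeds the backward count, while the
opposite holds for lower records, which is the asymmetry exploited in the analysis.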

Figure 3(b) displays the corresponding results for data set EII. Since the rate of temperature
increase was relatively constant during this time period, we find good quantitative agreement
between the data and the model predictions. The agreement is even more striking for the mean record
number displayed in Figure 4. In a stationary climate the expected number of records observed over
n years is

$$R_n = \sum_{k=1}^{n} \frac{1}{k} \approx \ln n + \gamma, \qquad (6)$$

where γ ≈ 0.5772156... is the Euler-Mascheroni constant. For a 30 year period this amounts to an
expected record number of 3.98, which is to be compared to the observed number 4.24 for the upper
records, and 3.66 for the lower records of dTmax. Together Figures 3 and 4 provide a strong
validation of our model. Using our estimate v/σ = 0.014 for data set EII, Eq. (5) predicts that the
increase in mean temperature has increased the rate of record occurrence by about 40% over the time
period from 1976-2005, which implies an additional 5 out of 17 records per year in 2005.

Fig. 5: (Color online) Record frequencies for data set AI (1881-2005). Full line shows the average
number of upper records per calendar day after a 9-year sliding average, short-dashed line shows
the corresponding frequency for lower records, and long-dashed line shows the prediction
$P_n^{d} = (1 - d/\sigma)/n$ for a stationary climate with discrete measurements.
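
The stationary reference value follows from summing the record rates 1/k over the observation
period. The short sketch below compares the exact sum with the logarithmic approximation; the
function names are chosen for this example.

```python
import math

EULER_GAMMA = 0.5772156649015329

def expected_records_exact(n):
    """Sum of the stationary record rates 1/k for k = 1..n."""
    return sum(1.0 / k for k in range(1, n + 1))

def expected_records_asymptotic(n):
    """Large-n approximation ln(n) + gamma."""
    return math.log(n) + EULER_GAMMA

for years in (30, 44):
    exact = expected_records_exact(years)
    approx = expected_records_asymptotic(years)
    print(f"n = {years}: sum = {exact:.3f}, ln n + gamma = {approx:.3f}")
```

The two expressions differ by roughly 1/(2n), which is small compared with the differences between
the observed upper and lower record numbers discussed above.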

Similar analyses were carried out for upper and lower records of dTmin. We find that the mean
record number of dTmin behaves similarly to the number of records of dTmax, with 4.32 upper records
and only 3.66 lower records. In the backward time analysis we found 3.71 upper records for dTmax
and only 3.62 for dTmin. The number of lower records was increased in the backward time analysis,
which is in agreement with the results for the upper records. In summary, the number of lower
records has decreased in the same manner as the number of upper records has increased (see
Fig. 3(b)).

American data and discreteness effects. - The American data sets were extracted from a total of
1062 stations [21]. Requiring again a reliability of at least 95%, we were left with 10 stations
that recorded over the 125 year time span 1881-2005 (data set AI) and 207 stations that recorded
over the 30 year time span 1976-2005 (data set AII). While the 10 stations of data set AI can be
assumed to be independent, the number of effectively independent time series in data set AII is
again much smaller. The results of the record analysis were similar to those for the European data
sets, with two important differences. First, owing to the continental character of the American
climate, the standard deviation σ is considerably larger than in Europe, which, according to
Eq. (5), implies a weaker effect on the record rate. For example, for data set AII we estimate a
warming rate of v = 0.025 ± 0.002°C/yr and a standard deviation of σ = 4.9 ± 0.1°C, which yields a
ratio v/σ that is only about one third of the value for data set EII.

Second, the American data were rounded to full degrees Fahrenheit, whereas the European data were
measured in tenths of degrees Celsius. As a consequence, the probability of ties is significant in
the American data but negligible for the European data sets. Here we count only strong records,
i.e. a record is broken only by a value that exceeds the current record. To account for these
discreteness effects one computes the probability that a current record is tied in the nth event.
For a small unit of discretization d ≪ σ, this probability is approximately d/(σn). This leads to
the probability for a record event with discretization, $P_n^{d} \approx (1 - d/\sigma)/n$, and
summing over n the reduction of the number of strong records due to ties in a stationary climate is
well described by [23]

$$R_n^{d} \approx (\ln n + \gamma)(1 - d/\sigma) + 2d/\sigma. \qquad (7)$$

For the American data sets d = 5/9°C = 0.5555...°C, which reduces the number of strong records per
day expected in a stationary climate over a 30 year period from 3.98 to 3.75. In comparison, the
observed number of records in data set AII is equal to 3.86 in the forward analysis and 3.66 in the
backward analysis. Again, warming has significantly increased the number of records, but the effect
is less pronounced than for the European stations. The evolution of record frequencies in data set
AI is shown in Fig. 5. Note that a failure to account for the discreteness effect would lead to an
apparent asymmetry between high and low records relative to the stationary case. Such an asymmetry
was observed in the analysis of American temperature data in [19], where it was suggested that
warming primarily reduces the number of low temperature records, while the effect on high records
is less pronounced.
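
The reduction from 3.98 to 3.75 can be reproduced by inserting the numbers for data set AII into
the expression (7). The function name below is chosen for this example; d and σ are the values
quoted in the text.

```python
import math

EULER_GAMMA = 0.5772156649015329

def expected_strong_records(n, d_over_sigma):
    """Stationary number of strong records over n years for rounding unit d, Eq. (7)."""
    return (math.log(n) + EULER_GAMMA) * (1.0 - d_over_sigma) + 2.0 * d_over_sigma

d = 5.0 / 9.0    # one degree Fahrenheit expressed in degrees Celsius
sigma = 4.9      # standard deviation quoted for data set AII, in degrees Celsius

continuous = expected_strong_records(30, 0.0)
rounded = expected_strong_records(30, d / sigma)
print(f"continuous measurements: {continuous:.2f}")
print(f"rounded to 1 F:          {rounded:.2f}")
```

Comparing the observed record numbers with the rounded reference value rather than with the
continuous one is what removes the apparent asymmetry between high and low records.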

Re-analysis data. - Taken together, the results presented so far show that the increased
occurrence of temperature records can be linked quantitatively to the ratio v/σ of warming rate and
temperature variability. Using the ERA-40 re-analysis data [22], we were able to extend this
analysis to the spatial distribution of the record rate. The data consist of daily temperature
series over the period 1957-2000 for 252 geographic locations in central Europe arranged on a
regular grid, covering an area of about 3×10^6 km². For each location the number of upper records
of the daily maximal temperature was determined, and the results are shown in the form of a "record
map" in Fig. 6(a). The comparison with a corresponding map of local values of the ratio v/σ in
Fig. 6(b) shows similar patterns, supporting our conclusion that v/σ is a good (if not perfect)
predictor for the increased occurrence of records. Interestingly, the two most pronounced islands
of high record occurrence (Rn > 4.8) in Fig. 6(a) are attributed to different mechanisms. One, in
southern France, reflects the exceptionally high rate of warming v in this region, whereas the
other, over the North Sea, is a consequence of a low temperature variability σ.

Fig. 6: (Color online) Spatial distribution of record number and normalized warming rate in
central Europe based on re-analysis data (1957-2000). (a) Contour map of the number of records,
computed from the 365 time series of daily high temperatures for each point on a rectangular grid
of 14 x 18 = 252 stations. The expected number of records in a stationary climate is 4.36.
(b) Contour map of the spatial distribution of the rate of warming, normalized by the standard
deviation.

An analysis of the seasonal variability of the record events in the (spatially averaged)
re-analysis data leads to a similar result. We compared the seasonal distribution of the difference
between the mean record numbers in forward and backward time analysis to that of the ratio v/σ, and
found a close match between the two (Fig. 7). While the standard deviation shows a clear seasonal
pattern with a pronounced maximum in winter, the seasonal variability of the warming rate is rather
complicated. As a consequence, a simple seasonal pattern in the rate of record occurrence could not
be identified.

Conclusions. - In summary, by combining a simple mathematical model with extensive data analysis,
we have conclusively established that the current rise in mean temperature significantly affects
the rate of occurrence of new temperature records. While the majority of the high temperature
records observed in Europe at the end of the 30 year period from 1976-2005 would have occurred even
in a stationary climate, the effect of warming is substantial, leading to an additional 5 out of 17
records per year.

The key parameter governing the effect of warming on the occurrence of records is the ratio v/σ,
and to leading order the change in record rate is linear in this parameter. It is instructive to
explore the future frequency of record-breaking events under the assumption that v/σ will remain
constant. The expression (5) then predicts that the enhancement of the record frequency (compared
to the expectation Pn = 1/n in a stationary climate) will continue to increase, up to the point
where the expansion underlying (5) breaks down when the two terms become of comparable magnitude at
a time roughly of order n* ≈ σ/v. Beyond this time the record rate saturates at a constant value
P*. Using our estimate v/σ ≈ 0.014 for data set EII, we find that P* ≈ 0.033 for the Gaussian
distribution. This implies that, towards the end of this century, daily high temperatures exceeding
all values measured since 1976 will continue to occur in Europe on about 12 days of the year; at
the same time the occurrence of low temperature records will essentially cease.

Fig. 7: (Color online) Seasonal distribution of the excess number of temperature records compared
to the seasonal distribution of v/σ in central Europe based on re-analysis data (1957-2000). The
warming rate v and the standard deviation σ for a given calendar day were computed as described in
the caption of Fig. 2: the warming rate for a given calendar day is the average over all stations
of the slope of the corresponding linear regression, and the standard deviation is obtained from
the squared deviation from the linear trend, averaged over all stations and years. Full line
represents the difference between the mean number of high temperature records in the forward time
analysis and the backward time analysis for the entire 43 year period (left axis). Dotted line
gives the seasonal distribution of v/σ (right axis). Both lines were obtained by performing a
sliding 30-day average.
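
The saturation of the record rate can be made visible by evaluating the drifting-mean record rate
numerically for increasing n. The sketch below does this for a Gaussian density with v/σ = 0.014
using a trapezoidal rule; the function names, the integration grid and the chosen values of n are
assumptions made for this illustration.

```python
import math

def phi(x):
    """Standard normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def record_rates(c, checkpoints, x_min=-8.0, x_max=8.0, steps=1600):
    """Record rate P_n of a unit Gaussian with drift c = v/sigma, for each n >= 2 requested."""
    n_max = max(checkpoints)
    h = (x_max - x_min) / steps
    totals = {n: 0.0 for n in checkpoints}
    for i in range(steps + 1):
        x = x_min + i * h
        weight = (0.5 if i in (0, steps) else 1.0) * h * phi(x)
        prod = 1.0
        for k in range(1, n_max):
            prod *= Phi(x + c * k)        # running product over the first k earlier entries
            if k + 1 in totals:           # after k factors the product belongs to n = k + 1
                totals[k + 1] += weight * prod
    return totals

rates = record_rates(0.014, [30, 100, 300, 1000])
for n in sorted(rates):
    print(f"n = {n:4d}: P_n = {rates[n]:.4f}, stationary 1/n = {1.0 / n:.4f}")
```

The computed rate decreases with n but levels off instead of following 1/n; the plateau can be
compared with the value P* ≈ 0.033 quoted above, which corresponds to about 12 record days per
year.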